We use Gödel's Dialectica interpretation to analyse Nash-Williams' elegant but non-constructive
'minimal bad sequence' proof of Higman's Lemma. The result is a concise constructive proof of
the lemma (for arbitrary decidable well-quasi-orders) in which Nash-Williams' combinatorial idea is
clearly present, along with an explicit program for finding an embedded pair in sequences of words.

1 Introduction 

We call a preorder $(X, \leq_X)$ a well-quasi-order (WQO) if any infinite sequence $(x_i)$ has the property
that $x_i \leq_X x_j$ for some $i < j$. The theory of WQOs contains several results which state that certain
constructions on WQOs inherit well-quasi-orderedness, the most famous being Kruskal's tree theorem
[11]. A special case of this theorem is Higman's lemma:

Theorem 1 (Higman, [9]). If $(X, \leq_X)$ is a WQO, then so is the set $(X^*, \leq_{X^*})$ of words in $X$ under the
embeddability relation $\leq_{X^*}$, where $\langle x_0, \ldots, x_{m-1} \rangle \leq_{X^*} \langle x'_0, \ldots, x'_{n-1} \rangle$ iff there is a strictly
increasing map $f : [m] \to [n]$ with $x_i \leq_X x'_{f(i)}$ for all $i < m$.
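The embeddability relation of Theorem 1 is decidable whenever $\leq_X$ is, and a greedy left-to-right matching suffices to decide it. The following is a minimal sketch, not taken from the paper: the names `embeds` and `first_good_pair` and the parameter `leq` (standing for $\leq_X$) are illustrative assumptions, and words are represented as Python sequences.

```python
def embeds(a, b, leq):
    """Decide whether word a embeds into word b with respect to leq.

    Greedy matching: each letter of a is matched with the first letter of b,
    to the right of the previous match, that dominates it. Advancing a single
    iterator over b makes the induced index map strictly increasing.
    """
    remaining = iter(b)
    return all(any(leq(x, y) for y in remaining) for x in a)


def first_good_pair(words, leq):
    """Return the first (i, j), ordered by j then i, with i < j and
    words[i] embedded in words[j]; return None if the finite list is bad."""
    for j in range(len(words)):
        for i in range(j):
            if embeds(words[i], words[j], leq):
                return (i, j)
    return None
```

For instance, over an alphabet ordered by equality one would call `first_good_pair(["ab", "ba", "aab"], lambda x, y: x == y)`. Higman's lemma says that on an infinite sequence over a WQO such a search always succeeds; it does not by itself say how far one has to look.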

A short proof of Higman's lemma (and more generally Kruskal's theorem) was given by Nash-Williams
[13], using an elegant but non-constructive combinatorial idea known as the minimal bad sequence
argument.

Higman's lemma has attracted a great deal of attention in logic and computer science, and has been a
focal point of research into computational aspects of classical reasoning used in infinitary combinatorics.
The constructive content of Nash-Williams' minimal bad sequence argument has been widely analysed
(see for instance [21, 19]), and in particular, constructive content has been extracted from the proof using
formal methods such as the A-translation [12] and inductive definitions [6]. An extensive study of program
extraction for Higman's lemma has been carried out by Berger and Seisenberger (see [3, 17]), who
improve the aforementioned techniques and implement them in the Minlog system.

In this article we give another constructive proof of Higman's lemma based on the minimal bad
sequence argument. The novelty of our approach is that we use a technique that has not been applied
in this context: Gödel's Dialectica interpretation. The combination of the negative translation and the
Dialectica interpretation forms an extremely powerful and efficient method for extracting programs from
classical proofs; testament to this is its central role in the well-known proof mining program (see [10]).

The formal extraction of computational information from proofs often results in output that is complex,
highly syntactic and difficult to understand in mathematical terms. However, the use of proof
theoretic techniques to analyse the constructive content of classical reasoning is becoming increasingly
relevant in mathematics. We therefore believe that it is important to produce case studies in which these
techniques are applied in a transparent and intuitive manner.

The goal of this article is not just a new proof of Higman's lemma, but a case study that sheds
some light on the functional interpretation of proofs in infinitary combinatorics. Our emphasis here
is not on 'mining' the proof for quantitative information but on producing a constructive justification of
Higman's lemma that can actually be read as a mathematical proof, and in which Nash-Williams' original
combinatorial idea is clearly present. In addition, we give a heuristic account of the operational behaviour
of the resulting program.

1.1 Preliminaries 

We formalise Higman's lemma in the language $\mathsf{PA}^\omega$ of Peano arithmetic in all finite types (see e.g. [1]
for details), although throughout the paper we endeavour to avoid excessive formality and make various
syntactic shortcuts to keep things as readable as possible. By extending $\mathsf{PA}^\omega$ with the axiom of dependent
choice

$$\mathsf{DC} : \forall n, x^X \exists y^X A_n(x, y) \to \forall x_0 \exists f^{\mathbb{N} \to X} \big(f(0) = x_0 \wedge \forall n\, A_n(f(n), f(n+1))\big)$$

over arbitrary types $X$, one obtains a theory of analysis capable of formalising a large portion of
mathematics, including Nash-Williams' minimal bad sequence construction.

Notation. We make use of the following conventions and abbreviations.

• $0_X$ denotes a canonical element of type $X$.

• Because we will be confronted with a large number of variables, we often use the convention that
when a term of type $X$ is denoted $x$, sequences of terms of the same type will often be denoted in
bold type $\mathbf{x}$.

• $s * \alpha$ represents the concatenation of the finite sequence $s$ and a finite/infinite sequence $\alpha$.

• We write $s \prec \alpha$ when the finite sequence $s$ is an initial segment of a finite/infinite sequence $\alpha$.

• $[\alpha](n)$ is the initial segment of the infinite sequence $\alpha$ of size $n$.

• We write $a \trianglelefteq b$ when a word $a : X^*$ is an initial segment of $b$, i.e. $|a| \leq |b|$ and $a_i = b_i$ for all $i < |a|$.
If $a$ is a proper prefix ($|a| < |b|$) we write $a \triangleleft b$.

• Given two sequences of words $u$ and $v$ we write $u \trianglelefteq_n v := ([u](n) = [v](n) \wedge u_n \trianglelefteq v_n)$ and
$u \triangleleft_n v := ([u](n) = [v](n) \wedge u_n \triangleleft v_n)$. The latter simply states that $u$ is lexicographically less than $v$ at point $n$
with respect to the prefix relation $\triangleleft$.
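The prefix relations above are easy to make concrete on finite data. The sketch below is illustrative only: the function names are assumptions, words are Python strings or lists, and an infinite sequence of words is approximated by a finite list with more than $n$ entries.

```python
def is_prefix(a, b):
    """a is an initial segment of b (possibly equal to b)."""
    return len(a) <= len(b) and all(a[i] == b[i] for i in range(len(a)))


def is_proper_prefix(a, b):
    """a is an initial segment of b and strictly shorter than b."""
    return len(a) < len(b) and is_prefix(a, b)


def lex_less_at(u, v, n):
    """u is lexicographically less than v at point n: the two sequences agree
    on their first n words, and u[n] is a proper prefix of v[n].
    Both u and v are finite lists of words with more than n entries."""
    return u[:n] == v[:n] and is_proper_prefix(u[n], v[n])
```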

1.2 The functional interpretation of proofs in $\mathsf{PA}^\omega + \mathsf{DC}$

This article assumes familiarity with Gödel's functional interpretation of classical proofs, by which we
mean the Dialectica interpretation combined with the negative translation. We do not have space to give 
details of the interpretation - for this the reader is referred to HI. However, it is useful to recall a few 
basic facts. 

• The functional interpretation of $\Sigma_2$ formulas coincides with the well-known no-counterexample
interpretation of Kreisel, interpreting $A \equiv \exists x \forall y A_0(x, y)$ as a functional $F$ that witnesses $\forall f \exists x A_0(x, f x)$.
Intuitively $F$ justifies $A$ by refuting arbitrary counterexample functions $f$ attempting to disprove $A$.

• The functional interpretation interprets $\Pi_2$ formulas $\forall x \exists y B(x, y)$ directly with a functional $f$
satisfying $\forall x B(x, f x)$, due to the fact that it admits Markov's principle. This means that we can use the
interpretation to extract programs from even classical proofs of $\Pi_2$ theorems.
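As a small illustration of the no-counterexample interpretation (a standard textbook example, not one taken from this article), consider the $\Sigma_2$ statement $\exists x \forall y\, (f(x) \leq f(y))$ for a function $f : \mathbb{N} \to \mathbb{N}$. A minimum point cannot in general be computed from $f$, but for any counterexample function $g$ one can compute an $x$ with $f(x) \leq f(g(x))$. The name `nci_minimum` and its parameters are assumptions made for this sketch.

```python
def nci_minimum(f, g, start=0):
    """Return x with f(x) <= f(g(x)).

    g tries to refute 'x is a minimum point of f' by proposing g(x).
    Whenever the refutation succeeds we move to g(x); since f takes
    natural-number values and strictly decreases along the way, the
    loop terminates.
    """
    x = start
    while f(g(x)) < f(x):
        x = g(x)
    return x
```

The returned point need not be a true minimum of `f`; it is only good enough to defeat the particular counterexample function `g`, which is exactly what the interpretation asks for.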
It was shown by Gödel that $\mathsf{PA}^\omega$ has a functional interpretation in the system T of higher-type
primitive recursive functionals. On the other hand, system T is insufficient to interpret the combination
of classical logic and countable choice. For this, one typically assigns a direct realizer to the negative 
translation of choice, usually some form of backward induction such as the well-known bar recursion 
devised by Spector in [HI. In this article dependent choice is interpreted using the more recent product 
of selection functions introduced in [7].

Definition 2. A selection function is any functional of type $J_R X := (X \to R) \to X$, for arbitrary $X$, $R$.
Given an indexed family of selection functions $\varepsilon : X^* \to J_R X$ together with functionals $q : X^\omega \to R$ and
$\varphi : X^\omega \to \mathbb{N}$, the product of selection functions EPS is defined by the recursion schema

$$\mathsf{EPS}^\varphi_s(\varepsilon)(q) := \begin{cases} \mathbf{0} & \text{if } \varphi(\hat{s}) < |s| \\ a_s * \mathsf{EPS}^\varphi_{s * a_s}(\varepsilon)(q_{a_s}) & \text{otherwise} \end{cases}$$

where $a_s = \varepsilon_s(\lambda x . q_x(\mathsf{EPS}^\varphi_{s * x}(\varepsilon)(q_x)))$, $q_x$ is defined by $q_x(\alpha) := q(x * \alpha)$ and $\hat{s}$ is the canonical extension
of $s$.

EPS is a variant of bar recursion that makes explicit the idea that bar recursion can be viewed as a kind
of backtracking algorithm analogous to the computation of optimal strategies in games of unbounded
length. We feel it is good practice to choose it over Spector's original bar recursion because it comes
naturally equipped with this game semantics. The idea is to imagine $q : X^\omega \to R$ specifying the outcome of
a sequential game with moves of type $X$ and outcome of type $R$, the $\varepsilon_s$ as selection functions that specify
a strategy for round $|s|$ given that $s$ has already been played, and $\varphi : X^\omega \to \mathbb{N}$ as a control functional that
indicates when the game has terminated. For further details on EPS see [8]. By unwinding Definition
2 one can prove the following key result.



Theorem 3 (Main theorem on EPS, cf. [16]). Setting $\alpha := \mathsf{EPS}^\varphi_{\langle\rangle}(\varepsilon)(q)$ and $p_s := \lambda x . q_{s * x}(\mathsf{EPS}^\varphi_{s * x}(\varepsilon)(q_{s * x}))$
solves the following system of equations

$$\alpha(n) = \varepsilon_{[\alpha](n)}(p_{[\alpha](n)}), \qquad q(\alpha) = p_{[\alpha](n)}(\alpha(n)) \tag{1}$$

for all $n \leq \varphi\alpha$.

As originally established by Spector, in order to witness the functional interpretation of dependent
choice it is sufficient to solve the equations (1) given $\varepsilon$, $q$ and $\varphi$. Therefore a consequence of Theorem 3
is that EPS realizes the functional interpretation of dependent choice. For full details of the interpretation
of choice via EPS the reader is referred to [16]. In this article however, it is enough to know that EPS
solves (1): in our interpretation of the minimal bad sequence construction an instance of these equations
naturally arises and we will solve them directly using EPS, bypassing the formal interpretation of choice.

The statement that $X^*$ is a WQO can be written as a $\Pi_2$ sentence. By formalising the classical
proof of Higman's lemma in $\mathsf{PA}^\omega + \mathsf{DC}$, we guarantee in theory that given a realizer for the
well-quasi-orderedness of $X$ we can extract a direct realizer $\Gamma : (X^*)^\omega \to \mathbb{N}$ in T + EPS that bounds the search for an
embedded pair in an arbitrary sequence of words. We formalise the proof in Sect. 3 and extract a realizer
$\Gamma$ in Sect. 4.
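The recursion behind the product of selection functions of Definition 2 can be sketched in a few lines of Python. This is a hedged illustration rather than the formal definition: infinite sequences are represented by finite lists that are implicitly padded with a default element, the outcome functional `q` is applied to the whole play rather than to a relative tail (so `q(s + [x] + ...)` plays the role of $q_x$ applied to the continuation), and all names (`eps`, `at`, `argmax`, `TARGET`) are assumptions introduced for the example.

```python
def eps(s, epsilon, q, phi):
    """Continuation of the finite play s chosen by the product of selection functions.

    s       -- list of moves played so far
    epsilon -- epsilon(s) is the selection function used in round len(s)
    q       -- outcome functional, applied to a (finite approximation of a) play
    phi     -- control functional; the recursion stops once phi(s) < len(s)
    """
    if phi(s) < len(s):
        return []  # the constant default tail is represented by the empty list
    # a_s: the move selected in round len(s), judged by the outcome that the
    # recursive continuation would produce after each candidate move x
    a = epsilon(s)(lambda x: q(s + [x] + eps(s + [x], epsilon, q, phi)))
    return [a] + eps(s + [a], epsilon, q, phi)


# Toy game: three binary moves, outcome = number of positions agreeing with TARGET.
TARGET = [1, 0, 1]


def at(alpha, i):
    """i-th entry of a finite list viewed as an infinite sequence padded with 0."""
    return alpha[i] if i < len(alpha) else 0


def q(alpha):
    return sum(1 for i in range(3) if at(alpha, i) == TARGET[i])


def argmax(s):
    """Selection function for every round: pick the move maximising the outcome."""
    return lambda p: max([0, 1], key=p)


def phi(alpha):
    return 2  # rounds 0, 1 and 2 are relevant


play = eps([], argmax, q, phi)
```

With maximising selection functions the computed play is an optimal one for the toy game, and it satisfies the equations (1) for every $n \leq \varphi\alpha$: each move is the one its selection function picks against the optimal continuation. The selection functions used later in the article are of course not simple maximisers; only the recursion pattern is the same.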



2 A Classical Proof of Higman's Lemma 



We begin by presenting Nash-Williams' proof of Higman's lemma. First we need the following simple 
result. 

Lemma 4. In a WQO $(X, \leq_X)$, any sequence $(x_i)$ has an infinite increasing subsequence.

Proof. For general WQOs this is an easy consequence of Ramsey's theorem. □

In the following we call a sequence in a preorder $X$ good if $x_i \leq_X x_j$ for some $i < j$. A sequence is
bad if it is not good. $X$ is a WQO if all sequences in $X$ are good.

Proof of Theorem 1 (Nash-Williams, [13]). Suppose for contradiction that $X$ is a WQO, but there exists
at least one bad sequence $u$ in $(X^*)^\omega$. Then among all bad sequences we pick a minimal bad sequence as
follows:

1. Choose $v_0$ to be an element of $X^*$ with the property that $v_0$ is the first element of some bad sequence
but no prefix of $v_0$ extends to a bad sequence in this way. Such an element exists by the assumption
that we have at least one bad sequence $u$.

2. Given that $v_0, \ldots, v_{n-1}$ have been selected, choose $v_n$ to be an element with the property that
$v_0, \ldots, v_n$ starts a bad sequence but $v_0, \ldots, v_{n-1}, y$ does not extend to a bad sequence for any prefix
$y \triangleleft v_n$.

By dependent choice we can construct an infinite sequence $(v_i)$ in this manner. It is easy to see that $(v_i)$
must itself be bad and therefore in particular each word $v_i$ must be non-empty, so we can write $v_i = \tilde{v}_i * x_i$
where the $x_i$ form an infinite sequence in $X$.

Now by Lemma 4 the sequence $(x_i)$ has an increasing subsequence

$$x_{i_0} \leq_X x_{i_1} \leq_X \cdots$$

Consider the sequence

$$v_0, \ldots, v_{i_0 - 1}, \tilde{v}_{i_0}, \tilde{v}_{i_1}, \ldots$$

This sequence must be bad, else $(v_i)$ would be good, but $\tilde{v}_{i_0}$ is a proper initial segment of $v_{i_0}$, contradicting
the minimality of $(v_i)$ at $i_0$. Therefore there cannot exist an initial bad sequence $u$ in $X^*$. □

3 Formalising the Classical Proof 

We now formalise Nash-Williams' proof in $\mathsf{PA}^\omega + \mathsf{DC}$, so that we are ready to apply the functional
interpretation in the next section. Given a preorder $(X, \leq_X)$ define the predicate $\theta_X$ on $X^\omega \times \mathbb{N}$ by

$$\theta_X(x, j) := \forall i_0 < i_1 < j\, (x_{i_0} \not\leq_X x_{i_1}).$$

We define the predicate $\theta_{X^*}$ on $(X^*)^\omega \times \mathbb{N}$ similarly. We suppress the subscript on $\theta$ when it is clear
which type it applies to.

Remark 5. In this article the intuition is that the underlying WQO $X$ consists of elements of type 0, and
that the relation $\leq_X$ is decidable. Therefore the relations derived from $\leq_X$, and $\theta$, will all be decidable over both $X$ and $X^*$.

A sequence $x$ is bad if it satisfies the $\Pi_1$ predicate $\forall j\, \theta(x, j)$. The preorder $X$ is a WQO if the closed
$\Pi_2$ predicate $\mathsf{WQO}[X] :\equiv \forall x \exists j \neg\theta_X(x, j)$ holds; similarly $X^*$ is a WQO if $\mathsf{WQO}[X^*] :\equiv \forall u \exists j \neg\theta_{X^*}(u, j)$
holds. Higman's lemma can then be formally written as

$$\mathsf{WQO}[X] \to \mathsf{WQO}[X^*].$$
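Since $\leq_X$ is assumed decidable, the predicate $\theta$ is a finite check on an initial segment of the sequence. The sketch below makes this explicit on finite lists; the names `theta` and `least_good_bound` and the parameter `leq` are illustrative assumptions.

```python
def theta(seq, j, leq):
    """theta(x, j): there are no i0 < i1 < j with seq[i0] <= seq[i1],
    i.e. the first j elements of the sequence form a bad segment."""
    return not any(leq(seq[i0], seq[i1]) for i1 in range(j) for i0 in range(i1))


def least_good_bound(seq, leq):
    """Least j <= len(seq) such that theta(seq, j) fails, or None if the
    whole finite list is bad."""
    for j in range(len(seq) + 1):
        if not theta(seq, j, leq):
            return j
    return None
```

For sequences of words one would pass a decision procedure for the embeddability relation as `leq`. A realizer for $\mathsf{WQO}[X^*]$ is a functional that, given $u$, returns some $j$ for which $\theta(u, j)$ fails, without having to search blindly.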
In the proof of Higman's lemma, the hypothesis $\mathsf{WQO}[X]$ appears in the form given by Lemma 4, namely
that any sequence in $X$ has an infinite monotone subsequence:

$$\mathsf{MonSeq}[X] :\equiv \forall x^{X^\omega} \exists g^{\mathbb{N} \to \mathbb{N}} \forall k\, \forall i < j \leq k\, (g i < g j \wedge x_{g i} \leq_X x_{g j}). \tag{2}$$

In our interpretation of Nash-Williams' proof we do not analyse the computational content of Lemma 4;
rather we directly interpret

$$\mathsf{MonSeq}[X] \to \mathsf{WQO}[X^*].$$

There are two reasons for this. The first is that in general the passage from $\mathsf{WQO}[X]$ to $\mathsf{MonSeq}[X]$
requires Ramsey's theorem and therefore full dependent choice. While one could in theory interpret
Lemma 4 using bar recursion or the product of selection functions, in this article we wish to focus on the
main content of Nash-Williams' proof, so we omit these details.

The second reason is that in certain interesting cases it is easy to prove $\mathsf{MonSeq}[X]$ directly, without
resorting to Ramsey's theorem. For instance, when the underlying alphabet $X$ is a finite set, $\mathsf{MonSeq}[X]$ is
provable in $\mathsf{PA}^\omega$ using the infinite pigeonhole principle, and so a realizer for the functional interpretation
of $\mathsf{MonSeq}[X]$ can be given in system T.

3.1 The Minimal Bad Sequence Argument 

Our main step in the formalisation of Nash-Williams' proof is the formalisation of his minimal bad
sequence argument. The main non-trivial principle of $\mathsf{PA}^\omega$ we require is the least element principle

$$\mathsf{LEP} : \exists m A(m) \to \exists m' (A(m') \wedge \neg A(m' - 1)),$$

where in our version we assume that $A$ is monotone in the sense that it satisfies (i) $i < j \to (A(i) \to A(j))$
and (ii) $\neg A(0)$.
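When $A$ is decidable, the least element principle has an obvious computational reading: from a witness $m$ one can find $m'$ by search. The sketch below uses bisection and is purely illustrative (the name `least_element` is an assumption). In the article $A$ is not decidable, which is why a more careful realizer is needed later.

```python
def least_element(A, m):
    """Given a decidable predicate A on the naturals with A(0) false and
    A(m) true, return m' with A(m') true and A(m' - 1) false.

    Invariant of the loop: A(lo) is false and A(hi) is true. For monotone A
    the result is the least element satisfying A.
    """
    lo, hi = 0, m
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if A(mid):
            hi = mid
        else:
            lo = mid
    return hi
```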

Lemma 6 (Minimal bad sequence construction). It is provable in $\mathsf{PA}^\omega + \mathsf{DC}$ that for any sequence of
words $u : (X^*)^\omega$, there exists a sequence $\mathbf{p}_u = p^0, p^1, \ldots$ of sequences of type $(X^*)^\omega$ and a sequence
$\mathbf{f}_u = f^0, f^1, \ldots$ of functions of type $(X^*)^\omega \to \mathbb{N}$, which, defining $p^{-1} := u$, together satisfy the following
sentences:

$$\forall n\, ([p^{n-1}](n) = [p^n](n)); \tag{3}$$

$$\forall n, j\, (\neg\theta(p^n, j) \to \neg\theta(p^{n-1}, j)); \tag{4}$$

$$\forall n, q^{(X^*)^\omega}\, (q \triangleleft_n p^n \to \neg\theta(q, f^n q)). \tag{5}$$

This formulation of the minimal bad sequence construction is a little more intricate than that given in
Sect. 2; in particular our aim is to highlight the computational aspects of the construction. The intuition
is that the sequence $\mathbf{p}_u$ is classically constructed in the following manner:

1. Given an initial sequence $u$, we choose $p^0$ to be a bad sequence such that $p^0_0 \trianglelefteq u_0$ but no $y \triangleleft p^0_0$
extends to a bad sequence. If no prefix of $u_0$ extends to a bad sequence we set $p^0 := u$.

2. Given that we have constructed $p^{n-1}$ we choose $p^n$ to be a bad extension of $[p^{n-1}](n)$ such that
$[p^n](n) * y$ does not extend to a bad sequence for any $y \triangleleft p^n_n$. If no such bad extension exists, we
set $p^n := p^{n-1}$.

If $\mathbf{p}_u$ is defined in this way then it clearly satisfies (3), and for each $p^n$ we can produce a (classically
constructed) function $f^n$ that witnesses the minimality of $p^n$ in the sense of (5).

We observe that the $p^n$ are not necessarily bad (in fact if $X$ is a WQO they never will be), but the
point is that $p^n$ only fails to be bad in the event that $p^{n-1}$ is good, in which case we must have $p^n = p^{n-1}$.
This is the intuition behind (4). Nash-Williams' proof is based on the fact that if $X$ is a WQO then by (5)
we can show that there is some $n$ and $j$ such that $\theta(p^n, j)$ fails, and then by induction over (4) we must
have $\neg\theta(u, j)$.

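Proof of Lemma 6. Suppose for the moment that $n$ and $w^{(X^*)^\omega}$ are fixed. Define the monotone predicate

$$A(m) :\equiv \exists r^{(X^*)^\omega} \forall i\, |A_m|^r_i \quad \text{where} \quad |A_m|^r_i :\equiv r \trianglelefteq_n w \wedge |r_n| < m \wedge (\theta(w, i) \to \theta(r, i)).$$

It is clear that $A(m)$ is monotone, and that $\forall i\, |A_{|w_n|+1}|^w_i$ holds. Therefore by LEP there exists some $m'$
such that

$$\exists p \forall j\, (p \trianglelefteq_n w \wedge |p_n| < m' \wedge (\theta(w, j) \to \theta(p, j))) \wedge \forall q \exists k\, (q \trianglelefteq_n w \wedge |q_n| < m' - 1 \to (\theta(w, k) \wedge \neg\theta(q, k))). \tag{6}$$

Now, observing that if $p \trianglelefteq_n w \wedge |p_n| < m'$ then $q \triangleleft_n p \to q \trianglelefteq_n w \wedge |q_n| < m' - 1$, we can prove in $\mathsf{PA}^\omega$ that
(6) implies

$$\exists p\, \big(\forall j\, ([w](n) = [p](n) \wedge (\theta(w, j) \to \theta(p, j))) \wedge \forall q \exists k\, (q \triangleleft_n p \to \neg\theta(q, k))\big). \tag{7}$$

Skolemizing (7) we have that for arbitrary $n$, $w$, there exists a sequence $p$ and function $f : (X^*)^\omega \to \mathbb{N}$
satisfying

$$\forall j, q\, \big([w](n) = [p](n) \wedge (\theta(w, j) \to \theta(p, j)) \wedge (q \triangleleft_n p \to \neg\theta(q, f q))\big). \tag{8}$$

By DC of type $(X^*)^\omega \times ((X^*)^\omega \to \mathbb{N})$ applied to (8) (only dependent on the sequence part of the previous
entry), defining an initial value $p^{-1} := u$ there exists an infinite sequence of sequences $\mathbf{p}_u = p^0, p^1, \ldots$ and
functions $\mathbf{f}_u = f^0, f^1, \ldots$ satisfying

$$\forall n, j, q\, \big([p^{n-1}](n) = [p^n](n) \wedge (\theta(p^{n-1}, j) \to \theta(p^n, j)) \wedge (q \triangleleft_n p^n \to \neg\theta(q, f^n q))\big). \tag{9}$$

This completes the proof, as (3), (4) and (5) clearly follow from (9). □

In the following $\mathsf{MB}[X^*]$ abbreviates the statement that for all $u$ there exist $\mathbf{p}_u$ and $\mathbf{f}_u$ satisfying (3)-(5).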

3.2 Completing the Proof 

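Notation. Given a non-empty word $x : X^*$ we write $x = \tilde{x} * \bar{x}$ where $\tilde{x} : X^*$ and $\bar{x} : X$. So that these are
well defined for all $x$, we define $\widetilde{\langle\rangle} := \langle\rangle$ and $\overline{\langle\rangle} := 0_X$. Given a sequence of sequences $\mathbf{p} : ((X^*)^\omega)^\omega$ we define the
diagonal sequences $\tilde{p} : (X^*)^\omega$ by $\tilde{p}_i := \widetilde{p^i_i}$ and $\bar{p} : X^\omega$ by $\bar{p}_i := \overline{p^i_i}$.

Theorem 7. It is provable in $\mathsf{PA}^\omega$ that $\mathsf{MonSeq}[X] \wedge \mathsf{MB}[X^*] \to \mathsf{WQO}[X^*]$.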

Proof. Take an arbitrary sequence $u : (X^*)^\omega$. By $\mathsf{MB}[X^*]$ there exist $\mathbf{p}_u$ and $\mathbf{f}_u$ satisfying (3)-(5). We
show that one of the $p^i$ must be good, which by (4) implies that $u$ must also be good.

By $\mathsf{MonSeq}[X]$ applied to $\bar{p}$ there exists a monotone function $g$ such that $\bar{p}_{g i} \leq_X \bar{p}_{g j}$ for all $i < j$.
Define

$$\psi := [p^{g0-1}](g0) * (\tilde{p}_{g i})_{i \in \mathbb{N}}.$$

Now either $p^{g0}_{g0}$ is empty (and hence $p^{g0}$ is trivially good) or $\tilde{p}_{g0} \triangleleft p^{g0}_{g0}$ and thus $\psi \triangleleft_{g0} p^{g0}$, which by (5)
implies that $\neg\theta(\psi, f^{g0}\psi)$, i.e. the sequence

$$[\psi](f^{g0}\psi + 1) = p^{g0-1}_0, \ldots, p^{g0-1}_{g0-1}, \tilde{p}_{g0}, \tilde{p}_{g1}, \ldots, \tilde{p}_{g(f^{g0}\psi - g0)}$$

has one word contained in a later one. But by construction of $g$ this implies that the sequence

$$p^{g0-1}_0, \ldots, p^{g0-1}_{g0-1}, p^{g0}_{g0}, p^{g0+1}_{g0+1}, \ldots, p^{g(f^{g0}\psi - g0)}_{g(f^{g0}\psi - g0)}, p^{g(f^{g0}\psi - g0)+1}_{g(f^{g0}\psi - g0)+1} \tag{$*$}$$

has one element contained in a later one (note that $\tilde{x} \leq_{X^*} \tilde{y} \to x \leq_{X^*} y$ unless $|x| = 1$ and $|y| = 0$, which is
why we need to add the extra element at the end of ($*$)). But by the nesting property (3), ($*$) is just an initial
segment of $p^{g(f^{g0}\psi - g0)+1}$, which must therefore be good. This completes the proof. □

Combining Theorem 7 with Lemma 6 we see that $\mathsf{MonSeq}[X] \to \mathsf{WQO}[X^*]$ can be formalised in
$\mathsf{PA}^\omega + \mathsf{DC}$. The proof as a whole is illustrated in Fig. 1.

Figure 1: Structure of Nash-Williams' proof. LEP and DC yield $\mathsf{MB}[X^*]$ (Lemma 6); Theorem 7 gives
$\mathsf{MonSeq}[X] \wedge \mathsf{MB}[X^*] \to \mathsf{WQO}[X^*]$; together these yield $\mathsf{MonSeq}[X] \to \mathsf{WQO}[X^*]$.



3.3 Computational Aspects of Nash-Williams' Proof

Now that we have formalised Nash-Williams' proof, we pause for a moment before the full program extraction
to look at the computational hints contained in the classical proof. Assuming a realizer $g$ for $\mathsf{MonSeq}[X]$,
given an arbitrary sequence of words $u : (X^*)^\omega$ suppose we construct $\mathbf{p}_u$, $\mathbf{f}_u$ as in Lemma 6 and the
sequence $\psi$ as in the proof of Theorem 7.

By inspecting the proof of Theorem 7 it is not too difficult to show that there exist $i_0 < i_1 < \phi(u)$
such that $u_{i_0} \leq_{X^*} u_{i_1}$, where

$$\phi(u) := g(f^{g0}\psi) + 1.$$

To see this, note that we prove that $\neg\theta(p^{g(f^{g0}\psi - g0)+1}, g(f^{g0}\psi - g0) + 1)$ and so therefore we also have
$\neg\theta(u, g(f^{g0}\psi - g0) + 1)$ by (4) and hence $\neg\theta(u, \phi(u))$ since $g$ is monotone.

Now $\phi(u)$ is clearly an ineffective bound for Higman's lemma, as it depends on non-constructive
objects $g$, $\mathbf{p}_u$ and $\mathbf{f}_u$. However, in order to verify the correctness of $\phi(u)$, we do not need the whole of
these objects. Rather

• $g$ must satisfy (2) up to $k = f^{g0}\psi$;

• $\mathbf{p}_u$, $\mathbf{f}_u$ must satisfy (3)-(5) up to $n = g(f^{g0}\psi) + 1$.

Therefore, if we have a procedure that will compute approximations to these objects up to a finite
point parametrised by those objects themselves, we can turn $\phi(u)$ into an effective bound for Higman's
lemma. This is precisely what the functional interpretation does.
4 A Constructive Proof of Higman's Lemma

We now build our constructive version of Nash-Williams' proof. This section follows closely the
structure of Sect. 3. Recall that we assume a realizer for the functional interpretation of $\mathsf{MonSeq}[X]$, namely
a functional $G : X^\omega \to ((\mathbb{N}^\mathbb{N} \to \mathbb{N}) \to (\mathbb{N} \to \mathbb{N}))$ satisfying (cf. (2))

$$\forall x^{X^\omega}, \varphi^{\mathbb{N}^\mathbb{N} \to \mathbb{N}}\, \forall i < j \leq \varphi(G_{x,\varphi})\, (G_{x,\varphi} i < G_{x,\varphi} j \wedge x_{G_{x,\varphi} i} \leq_X x_{G_{x,\varphi} j}). \tag{10}$$

In general, such a realizer could be obtained from a realizer of $\mathsf{WQO}[X]$ by implementing a
computational interpretation of Ramsey's theorem, such as the one given in [15] using the product of selection
functions. However, when $X$ is finite, $G$ can be given directly using the standard interpretation of the
infinite pigeonhole principle found in e.g. [14].
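To make the specification (10) concrete, the following sketch computes a function $g$ satisfying it in the simplest case: a finite alphabet ordered by equality. It is a naive unbounded search, not the system T realizer referred to above, and it is included only to show what the counterexample functional $\varphi$ demands. The names `monseq_finite` and `alphabet` are assumptions; the sequence `x` is given as a Python function on the naturals taking values in `alphabet`, and `phi` is assumed to inspect only finitely many values of its argument, which is what makes the search terminate.

```python
from itertools import count


def monseq_finite(x, phi, alphabet):
    """Search for g : N -> N with g(i) < g(j) and x(g(i)) == x(g(j))
    for all i < j <= phi(g).

    For each cut-off n and each letter c, the candidate g enumerates the
    positions of c among x(0), ..., x(n-1), followed by an arbitrary strictly
    increasing tail. The candidate is accepted once phi only asks about the
    genuine positions.
    """
    for n in count(1):
        for c in alphabet:
            pos = [i for i in range(n) if x(i) == c]
            if not pos:
                continue

            def g(i, pos=pos):
                if i < len(pos):
                    return pos[i]
                return pos[-1] + (i - len(pos) + 1)

            if phi(g) < len(pos):
                return g
```

By the pigeonhole principle some letter occurs infinitely often, so for that letter the candidates stabilise on any finite set of arguments while the number of genuine positions grows without bound; this is the informal reason the search halts.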



4.1 Interpreting the Minimal Bad Sequence Argument 

The central part of our constructive proof is the following constructive version of Lemma 6, which is
just a realizer for the functional interpretation of $\mathsf{MB}[X^*]$.

Notation. Recall (Sect. 1.2) that we denote the type of a selection function by $J_R X := (X \to R) \to X$.
We use the abbreviation $Y \equiv (X^*)^\omega \times ((X^*)^\omega \to \mathbb{N})$ for the type of our choice sequence. Also, in what
follows it will be useful to implicitly write variables $F : A \to B \times C$ as pairs $(F^0, F^1)$; this slight
abuse of types will make our syntax much more intuitive.

Lemma 8 (Minimal bad sequence construction). For fixed $n$ and $w^{(X^*)^\omega}$ define the decidable formula
$|A_m|^r_i$ by

$$|A_m|^r_i :\equiv r \trianglelefteq_n w \wedge |r_n| < m \wedge \theta(r, i),$$

which is slightly simpler than that used in the proof of Lemma 6.$^1$ Define the functionals
$\varepsilon_{n,w} : J_{\mathbb{N} \times (X^*)^\omega} Y$ by

$$\varepsilon_{n,w}(J, Q) := (p_i, f_i) \tag{11}$$

where $i \leq |w_n|$ is the greatest integer satisfying $\neg |A_i|^{Q(p_i, f_i)}_{f_i(Q(p_i, f_i))}$ and the finite sequences $p_0, \ldots, p_{|w_n|}$,
$f_0, \ldots, f_{|w_n|}$ are defined recursively by

$$f_0 := \mathbf{0}, \quad f_i := \lambda q . J(q, f_{i-1}), \qquad p_{|w_n|} := w, \quad p_{i-1} := Q(p_i, f_i). \tag{12}$$

Now, given an arbitrary sequence $u : (X^*)^\omega$, define the family of selection functions $\varepsilon^u : Y^* \to J_{\mathbb{N} \times (X^*)^\omega} Y$
by

$$\varepsilon^u_{\langle P, F \rangle}(J, Q) := \varepsilon_{|\langle P, F \rangle|,\, P_{|\langle P, F \rangle| - 1}}(J, Q), \tag{13}$$

where we define the initial value $P_{-1} := u$. Now, given counterexample functionals $\Omega, \Phi : Y^\omega \to \mathbb{N}$ and
$\Psi : Y^\omega \to (X^*)^\omega$, the sequences

$$\mathbf{p}_u, \mathbf{f}_u := \mathsf{EPS}^\Omega_{\langle\rangle}(\varepsilon^u)(\langle \Phi, \Psi \rangle)$$

satisfy, defining $p^{-1} := u$, the following sentences (cf. (3)-(5)):

$$\forall n \leq \Omega_{\mathbf{p},\mathbf{f}}\, ([p^{n-1}](n) = [p^n](n)); \tag{14}$$

$$\forall n \leq \Omega_{\mathbf{p},\mathbf{f}}\, (\neg\theta(p^n, \Phi_{\mathbf{p},\mathbf{f}}) \to \neg\theta(p^{n-1}, \Phi_{\mathbf{p},\mathbf{f}})); \tag{15}$$

$$\forall n \leq \Omega_{\mathbf{p},\mathbf{f}}\, (\Psi_{\mathbf{p},\mathbf{f}} \triangleleft_n p^n \to \neg\theta(\Psi_{\mathbf{p},\mathbf{f}}, f^n(\Psi_{\mathbf{p},\mathbf{f}}))). \tag{16}$$

$^1$ It would have been sufficient, although less direct, to obtain (7) in the proof of Lemma 6 by applying LEP to this simpler
formula. We opt for this variant now to simplify the subsequent constructions, as either version would result in essentially the
same program.

These sequences $\mathbf{p}_u$, $\mathbf{f}_u$ computed via the product of selection functions interpret the instance of DC
used in the minimal bad sequence construction, and witness the no-counterexample interpretation of
$\mathsf{MB}[X^*]$. The functional $\Omega$ determines how large the approximation to the choice sequence is, and $\Phi$, $\Psi$
in some sense calibrate its depth.

Our aim in the next section is to pick suitable counterexample functions such that (16) implies
$\neg\theta(p^n, \Phi_{\mathbf{p},\mathbf{f}})$ for some $n \leq \Omega_{\mathbf{p},\mathbf{f}}$; then by induction over (15) we have

$$\neg\theta(p^n, \Phi_{\mathbf{p},\mathbf{f}}) \to \neg\theta(p^{-1}, \Phi_{\mathbf{p},\mathbf{f}}) \equiv \neg\theta(u, \Phi_{\mathbf{p},\mathbf{f}}),$$

and we therefore obtain $\exists i_0 < i_1 < \Phi_{\mathbf{p}_u, \mathbf{f}_u}\, (u_{i_0} \leq_{X^*} u_{i_1})$, i.e. a constructive bound for $u$ being good. First
we must prove the lemma.

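Proof of Lemma 8. First, we show that $\varepsilon_{n,w}$ witnesses the functional (i.e. no-counterexample)
interpretation of (8), in the sense that given counterexample functions $J, Q : Y \to \mathbb{N} \times (X^*)^\omega$ for $j$, $q$ we have
(suppressing dependencies and writing $\varepsilon = \varepsilon_{n,w}(J, Q)$)

$$[w](n) = [\varepsilon^0](n) \wedge (\theta(w, J\varepsilon) \to \theta(\varepsilon^0, J\varepsilon)) \wedge (Q\varepsilon \triangleleft_n \varepsilon^0 \to \neg\theta(Q\varepsilon, \varepsilon^1(Q\varepsilon))). \tag{17}$$

The following is a constructive version of the proof of Lemma 6. Let $0 \leq i \leq |w_n|$ be the greatest number
such that $\neg |A_i|^{Q(p_i, f_i)}_{f_i(Q(p_i, f_i))}$; by definition we have $(\varepsilon^0, \varepsilon^1) = (p_i, f_i)$. There are two cases.

Case 1: $i = |w_n|$. Then we have

$$\neg |A_{|w_n|}|^{Q\varepsilon}_{\varepsilon^1(Q\varepsilon)} \equiv Q\varepsilon \trianglelefteq_n w \wedge |(Q\varepsilon)_n| < |w_n| \to \neg\theta(Q\varepsilon, \varepsilon^1(Q\varepsilon)).$$

Therefore, observing that $\varepsilon^0 = p_{|w_n|} := w$ and $Q\varepsilon \triangleleft_n \varepsilon^0 \to Q\varepsilon \trianglelefteq_n w \wedge |(Q\varepsilon)_n| < |w_n|$, we easily obtain
(17).

Case 2: $i < |w_n|$. By maximality of $i$, $|A_{i+1}|^{Q(p_{i+1}, f_{i+1})}_{f_{i+1}(Q(p_{i+1}, f_{i+1}))}$ must be true. Now looking at the defining
equations (12) we have $Q(p_{i+1}, f_{i+1}) = p_i = \varepsilon^0$ and $f_{i+1}(Q(p_{i+1}, f_{i+1})) = f_{i+1}(p_i) = J(p_i, f_i) = J\varepsilon$,
therefore the following two formulas are true:

$$|A_{i+1}|^{\varepsilon^0}_{J\varepsilon} \equiv \varepsilon^0 \trianglelefteq_n w \wedge |(\varepsilon^0)_n| < i + 1 \wedge \theta(\varepsilon^0, J\varepsilon); \tag{18}$$

$$\neg |A_i|^{Q\varepsilon}_{\varepsilon^1(Q\varepsilon)} \equiv Q\varepsilon \trianglelefteq_n w \wedge |(Q\varepsilon)_n| < i \to \neg\theta(Q\varepsilon, \varepsilon^1(Q\varepsilon)). \tag{19}$$

Now by (18) we have $[w](n) = [\varepsilon^0](n) \wedge (\theta(w, J\varepsilon) \to \theta(\varepsilon^0, J\varepsilon))$, and because $Q\varepsilon \triangleleft_n \varepsilon^0 \to Q\varepsilon \trianglelefteq_n w \wedge
|(Q\varepsilon)_n| < i$, by (19) we obtain $Q\varepsilon \triangleleft_n \varepsilon^0 \to \neg\theta(Q\varepsilon, \varepsilon^1(Q\varepsilon))$. Therefore (17) holds.

Thus we have shown that $\varepsilon_{n,w}$ witnesses (8) for arbitrary $n$, $w$, $J$ and $Q$. Now setting

$$\mathbf{p}_u, \mathbf{f}_u := \mathsf{EPS}^\Omega_{\langle\rangle}(\varepsilon^u)(\langle \Phi, \Psi \rangle)$$

and, writing $s_n := \langle [\mathbf{p}_u](n), [\mathbf{f}_u](n) \rangle$,

$$J_n(p, f) := \Phi_{s_n * (p, f)}(\mathsf{EPS}^\Omega_{s_n * (p, f)}(\varepsilon^u)(\langle \Phi, \Psi \rangle_{s_n * (p, f)})), \qquad Q_n(p, f) := \Psi_{s_n * (p, f)}(\mathsf{EPS}^\Omega_{s_n * (p, f)}(\varepsilon^u)(\langle \Phi, \Psi \rangle_{s_n * (p, f)})), \tag{20}$$

by the main theorem on EPS quoted in Sect. 1.2 we satisfy Spector's equations

$$p^n, f^n = \varepsilon^0_{n, p^{n-1}}(J_n, Q_n),\ \varepsilon^1_{n, p^{n-1}}(J_n, Q_n), \qquad \Phi_{\mathbf{p},\mathbf{f}}, \Psi_{\mathbf{p},\mathbf{f}} = J_n(p^n, f^n),\ Q_n(p^n, f^n) \tag{21}$$

for all $n \leq \Omega_{\mathbf{p},\mathbf{f}}$. By setting $w := p^{n-1}$, $J := J_n$ and $Q := Q_n$ in (17) and substituting in (21), we obtain
equations (14)-(16). □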

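4.2 Constructing a Realizer for Higman's Lemma

Definition 9. Given a pair of sequences $\mathbf{p} : ((X^*)^\omega)^\omega$ and $\mathbf{f} : ((X^*)^\omega \to \mathbb{N})^\omega$, let $G_{\mathbf{p},\mathbf{f}}$ be a realizer for
$\mathsf{MonSeq}[X]$ on the sequence $(\bar{p}_i)$ and counterexample function

$$\varphi_{\mathbf{p},\mathbf{f}} := \lambda g . f^{g0}([p^{g0-1}](g0) * (\tilde{p}_{g i})).$$

Define the functionals $\Omega$, $\Phi$ and $\Psi$ by (suppressing the subscript on $G$, $\varphi$)

$$\Omega(\mathbf{p}, \mathbf{f}) := G(\varphi G) + 1, \qquad \Phi(\mathbf{p}, \mathbf{f}) := G(\varphi G) + 1, \qquad \Psi(\mathbf{p}, \mathbf{f}) := [p^{G0-1}](G0) * (\tilde{p}_{G i}),$$

and set

$$\Gamma(u) := \Phi(\mathbf{p}_u, \mathbf{f}_u),$$

where $\mathbf{p}_u, \mathbf{f}_u := \mathsf{EPS}^\Omega_{\langle\rangle}(\varepsilon^u)(\langle \Phi, \Psi \rangle)$ with $\varepsilon^u$ defined as in Lemma 8.

The main theorem of this article is the following constructive analogue of Theorem 7.

Theorem 10 (Higman's lemma, constructive version). Suppose $X$ is a WQO. Then for all sequences of
words $u : (X^*)^\omega$ over $X$ we have

$$\exists i_0 < i_1 < \Gamma(u)\, (u_{i_0} \leq_{X^*} u_{i_1})$$

where $\Gamma$ is constructed as in Definition 9.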

Proof. Fix $u$. In what follows, $\mathbf{p}$, $\mathbf{f}$ are fixed as $\mathbf{p}_u$, $\mathbf{f}_u$. We use the abbreviation $\Omega_u := \Omega(\mathbf{p}_u, \mathbf{f}_u)$, and
similarly for $\Phi_u$, $\Psi_u$, $G_u$ and $\varphi_u$. We claim that there is some $n \leq \Omega_u$ satisfying $\neg\theta(p^n, \Phi_u)$. Then by
induction over (15), we see that $\neg\theta(u, \Phi_u)$, and the theorem follows from the definition of $\theta$. It remains
to prove the claim.

First observe that because $G_u$ is a realizer of $\mathsf{MonSeq}[X]$ for $\varphi_u$ we have (cf. (10))

$$\forall i < j \leq \varphi_u(G_u)\, (G_u i < G_u j \wedge \bar{p}_{G_u i} \leq_X \bar{p}_{G_u j}). \tag{22}$$

Now, $G0 \leq G(\varphi G)$ so we have $G0 < G(\varphi G) + 1 = \Omega_u$, therefore by (16) it follows that

$$\Psi_u \triangleleft_{G0} p^{G0} \to \neg\theta(\Psi_u, f^{G0}(\Psi_u)). \tag{23}$$

The premise of (23) must hold by construction of $\Psi_u$, since $[p^{G0-1}](G0) = [p^{G0}](G0)$ by (14) and $\tilde{p}_{G0} \triangleleft p^{G0}_{G0}$
(unless $p^{G0}_{G0} = \langle\rangle$, in which case we trivially have $\neg\theta(p^{G0}, G0 + 1)$ and hence $\neg\theta(p^{G0}, \Phi_u)$). Therefore
we have $\neg\theta(\Psi_u, \varphi G)$ since $f^{G0}(\Psi_u) = \varphi_u G_u$ by definition, i.e. the finite sequence

$$[\Psi_u](\varphi G + 1) = p^{G0-1}_0, \ldots, p^{G0-1}_{G0-1}, \tilde{p}_{G0}, \ldots, \tilde{p}_{G(\varphi G - G0)}$$

has one element contained in a later one (we illustrate the case $\varphi G \geq G0$; if $\varphi G < G0$ then $[p^{G0-1}](G0)$
is good and hence $\neg\theta(p^{G0-1}, \Phi_u)$). Now since $\varphi G - G0 \leq \varphi G$, by (22) we see that the sequence

$$p^{G0-1}_0, p^{G0-1}_1, \ldots, p^{G0-1}_{G0-1}, p^{G0}_{G0}, p^{G0+1}_{G0+1}, \ldots, p^{G(\varphi G - G0)}_{G(\varphi G - G0)}, p^{G(\varphi G - G0)+1}_{G(\varphi G - G0)+1} \tag{$*$}$$

has one element contained in a later one (we need to add an extra element for the same reason as we do in
the proof of Theorem 7). But because $G(\varphi G - G0) + 1 \leq G(\varphi G) + 1 = \Omega_u$, by the nesting property (14)
the sequence ($*$) is just an initial segment of $p^{G(\varphi G - G0)+1}$, and hence $\neg\theta(p^{G(\varphi G - G0)+1}, G(\varphi G - G0) + 1)$
which implies $\neg\theta(p^{G(\varphi G - G0)+1}, \Phi_u)$. This proves the claim, completing the proof. □

A rough map of our constructive proof, with partial realizers shown, is given as Fig. 2.

Figure 2: Structure of constructive proof. The selection functions $\varepsilon$ realize LEP and EPS realizes DC, so
$\mathsf{EPS}(\varepsilon)$ realizes $\mathsf{MB}[X^*]$ (Lemma 8); Theorem 10 realizes $\mathsf{MonSeq}[X] \wedge \mathsf{MB}[X^*] \to \mathsf{WQO}[X^*]$; together these
give the realizer $\lambda G . \lambda u . \Phi_G(\mathsf{EPS}^\Omega(\varepsilon)(\langle \Phi, \Psi \rangle))$ of $\mathsf{MonSeq}[X] \to \mathsf{WQO}[X^*]$.



4.3 An Informal Discussion on the Extracted Program $\Gamma$

We conclude the section with an informal analysis of our extracted realizer. Often, programs extracted 
from classical proofs via proof interpretations can be very difficult to understand, sometimes taking up 
several pages of abstruse higher type syntax or computer code to even state. In contrast, given the logical 
complexity of Nash-Williams' proof our realizer extracted using the Dialectica interpretation is relatively 
concise, and we can even describe its operational behaviour to an extent. 

We stress that everything which follows is heuristic and has not been properly formalised. Our aim 
is merely to illustrate that it is at least feasible to decipher our realizer on a qualitative level! 

Our algorithm uses the product of selection functions EPS to interpret the minimal bad sequence
argument used in Nash-Williams' proof. As observed in Sect. 1.2, EPS, and consequently our extracted
program, comes equipped with a natural game theoretic semantics. For a full account of this the reader
is advised to consult [8, 16]. However, for completeness we state, without further details, the game
theoretic reading of the key constructions in our algorithm.

• The functionals $\Phi$, $\Psi$ assign to any sequence (i.e. infinite play) $\mathbf{p}, \mathbf{f}$ an outcome of type $\mathbb{N} \times (X^*)^\omega$.

• The selection functions $\varepsilon^u$, built from the realizer of LEP, implement a strategy for constructing
an optimal play $\mathbf{p}_u, \mathbf{f}_u$, the selection function $\varepsilon_{n, p^{n-1}_u}$ being responsible for constructing the $n$th point
$p^n_u$ in the sequence given that we have already computed the previous value $p^{n-1}_u$.

• The selection functions make a decision based on the functionals $J_n$, $Q_n$ defined in (20) which (in
loose game theoretic terms) describe the optimal outcome of each potential choice at point $n$.

• The functional $\Omega$ acts as a control, determining the 'relevant part' of an infinite play $\mathbf{p}, \mathbf{f}$, thereby
telling EPS when it has computed a sufficiently long sequence.

In terms of Nash-Williams' proof, the sequence $\mathbf{p}_u, \mathbf{f}_u$ strategically constructed by EPS constitutes an
'attempt' at producing a minimal bad sequence from $u$ (given by $\mathbf{p}_u$, with accompanying functionals $f^n$
witnessing minimality at point $n$). We define $\Psi$ and $\Omega$ so that the construction can be essentially
reversed to obtain a bound for $u$.

So what can we say about this optimal sequence $\mathbf{p}_u, \mathbf{f}_u$? We prove in Theorem 10 that there is some
element of the approximation $p^n_u$ such that $\neg\theta(p^n_u, \Phi_u)$ holds. It is not too difficult to see, by (21), that
$\neg\theta(p^n_u, \Phi_u)$ can only hold if $\varepsilon_{n, p^{n-1}_u}$ picks the default value $p^n_u = p^{n-1}_u$. Similarly we have $p^{n-1}_u = p^{n-2}_u$
and so on, so EPS just returns the initial value $u$ at each step.

So how does the program justify selecting $u$ at point $n$, given that it has already chosen $u$ at $n - 1$?
We see that the selection function $\varepsilon_{n,u}$ always sets $(p^n_u, f^n_u) = (u, f_{|u_n|})$ (where the $f_i$ are defined as in
(12)), unless the outcome $Q_n(u, f_{|u_n|}) = \Psi_u$ is lexicographically less than $u$ at point $n$, in which case it
must check that $\theta(\Psi_u, f_{|u_n|}(\Psi_u))$ is false. But $f_{|u_n|}(\Psi_u) = J_n(\Psi_u, f_{|u_n|-1})$ by (12), which checks the final
outcome of EPS given the sequence

$$(u, f^0_u), \ldots, (u, f^{n-1}_u), (\Psi_u, f_{|u_n|-1}). \tag{$*$}$$

Now in the computation of EPS the functionals $\Omega$, $\Phi$, $\Psi$ only ever look at the first $i$ values of $p^i$.
Therefore we propose that because $[\Psi_u](n) = [u](n)$ (and $|u_n| - 1 = |(\Psi_u)_n|$) we can identify ($*$) with the
outcome of EPS given the sequence

$$(\Psi_u, f^0_u), \ldots, (\Psi_u, f^{n-1}_u), (\Psi_u, f_{|(\Psi_u)_n|}) \tag{24}$$

which by our previous argument can be viewed as the outcome of running our algorithm with initial
value $\Psi_u$ instead of $u$. In other words we make the identification $J_n(\Psi_u, f_{|u_n|-1}) \approx \Phi_{\Psi_u} = \Gamma(\Psi_u)$, which
explains why we must have $\neg\theta(\Psi_u, J_n(\Psi_u, f_{|u_n|-1}))$.

We claim that the algorithm $\Gamma$ obtained via EPS has characteristics of an open recursion procedure
(see e.g. [2]), computing $\Gamma(u)$ by internally computing values of $\Gamma(v)$ for $v$ lexicographically less than
$u$. If we take $\mathbf{p}_u$ to be the constant sequence with value $u$, then our bound for $u$ is given by $\Gamma(u) :=
\Phi(\mathbf{p}_u, \mathbf{f}_u) = G(\varphi G) + 1$ where now $G$ is a witness for $\mathsf{MonSeq}[X]$ on $\bar{u}$ and counterexample function
$\lambda g . f^{g0}([u](g0) * (\tilde{u}_{g i}))$. But by our previous argument we can identify $f^{g0}([u](g0) * (\tilde{u}_{g i}))$ with $\Gamma([u](g0) *
(\tilde{u}_{g i}))$. Thus it seems that $\Gamma$ is closely related to a functional $\tilde{\Gamma}$ defined, via open recursion, by $\tilde{\Gamma}(u) :=
G(\varphi G) + 1$ where $G$ is a witness for $\mathsf{MonSeq}[X]$ on the counterexample function

$$\varphi := \lambda g . \tilde{\Gamma}([u](g0) * (\tilde{u}_{g i})).$$

Of course, none of this is precise: the identifications above are made very informally, and in particular
we anticipate that the way our algorithm treats empty words would be more complex than a
straightforward open recursion procedure. However, our purpose here is merely to provide via a casual argument
some insight into how $\Gamma$ works.

It would be interesting to analyse the behaviour of our extracted algorithm in depth, to give a precise 
explanation of the way in which it computes bounds on bad sequences and compare this algorithm to 
those extracted using other methods. We leave this as an open problem. 
5 Final Comments

We have used Gödel's functional interpretation to produce a constructive version of Nash-Williams'
minimal bad sequence proof of Higman's lemma. Our proof is relatively short and concise, and the
combinatorial idea behind Nash-Williams' proof can be clearly seen in ours. Moreover, we can start to
make sense of the operational behaviour of the extracted algorithm, at least on an informal level. We
hope that this case study provides some insight into program extraction in infinitary combinatorics using
the functional interpretation.

An obvious direction of future work is to better understand our realizer and give a more satisfactory
description than that given in the previous section. One could potentially refine our realizer so that it is
more intuitive and efficient, or alternatively construct a new realizer that directly interprets the functional
interpretation of the minimal bad sequence argument and compare how it behaves to the one given here.
It would also be instructive to formalise our program extraction in a theorem prover, and actually run the
algorithm $\Gamma$ on some concrete WQOs to analyse its behaviour.

We close with the remark that the ideas in this article could be extended to solve the functional
interpretation of the general minimal bad sequence construction, and thereby extract programs from more
complex proofs that use this construction, such as Kruskal's theorem. While our focus in this article
was on the qualitative aspects of program extraction, it is natural to ask whether one could obtain useful
quantitative information from the analysis of proofs in this area of combinatorics. Bounds for Higman's
lemma on a finite alphabet have already been produced using more direct methods, e.g. [4], but it would
be interesting to see if any useful constructive information could be extracted in the general case or for
related theorems, through the formal analysis of proofs.